Robust Speaker Recognition with Combined Use of Acoustic and Throat Microphone Speech
Abstract
The accuracy of automatic speaker verification (ASV) systems degrades severely in the presence of background noise. In this paper, we study the use of additional side information provided by a body-conducted sensor, the throat microphone. The throat microphone signal is much less affected by background noise than the acoustic microphone signal, which makes throat microphones potentially useful for feature extraction and speech activity detection. First, we propose a new prototype system for simultaneous acquisition of acoustic and throat microphone signals. Second, we study the use of this additional information for speech activity detection, feature extraction, and fusion of the acoustic and throat microphone signals. We collect a pilot database of 38 subjects that includes both clean and noisy sessions. We carry out speaker verification experiments using a Gaussian mixture model with universal background model (GMM-UBM) system and an i-vector based system. We achieve considerable improvement in recognition accuracy, even in highly degraded conditions.
Related resources
Throat Microphone for Speaker Recognition Using AANN
In this paper, we analyze the performance of a speaker recognition system based on features extracted from speech recorded with a throat microphone in clean and noisy environments. In general, speaker recognition systems perform better on clean speech. In a noisy environment, a transducer held at the throat yields a signal that remains clean despite the noise. This speak...
Convolutional Neural Network with Adaptive Windows for Speech Recognition
Although speech recognition systems are widely used and their accuracy continues to improve, a considerable gap remains between their performance and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
Feature vector normalization with combined standard and throat microphones for robust ASR
We propose an on-line unsupervised compensation technique for robust speech recognition that combines standard and throat microphone feature vectors. The solution, called Multi-Environment Model-based LInear Normalization with Throat microphone information (MEMLINT), is an extension of the MEMLIN formulation. Hence, the standard microphone noisy space and the throat microphone space are modelled as GMMs and a s...
Increasing robustness in GMM speaker recognition systems for noisy and reverberant speech with low complexity microphone arrays
In this paper, we describe the additive robustness obtained through the combined use of a first acoustic processing step based on a low complexity microphone array, followed by a spectral normalization step. Microphone arrays have been shown to provide good results in reducing different sources of acoustic degradation. However, microphone arrays produce linear filtering effects that need to be compen...
Throat microphone signal for speaker recognition
Speaker recognition systems perform better when clean speech signals are used for the task. In the presence of high levels of background noise, speech recorded from a close-speaking microphone is degraded, and with it the performance of the speaker recognition system. A transducer held at the throat yields a signal that is clean even in a noisy environment. This paper discusses the...
Publication date: 2016